Search Results: "David Bremner"

11 December 2010

David Bremner: Converting META.yml to META.json

Before I discovered you could just point your browser at http://search.cpan.org/meta/Dist-Name-0.007/META.json to automagically convert META.yml to META.json, I wrote a script to do it.
Anyway, it goes with my "I hate the cloud" prejudices :).
use CPAN::Meta;
use CPAN::Meta::Converter;
use Data::Dumper;
my $meta = CPAN::Meta->load_file("META.yml");
my $cmc = CPAN::Meta::Converter->new($meta);
my $new=CPAN::Meta->new($cmc->convert(version=>"2"));
$new->save("META.json");
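Assuming the snippet is saved as, say, yml-to-json.pl (the file name is my invention, not part of the original) in the top directory of an unpacked distribution containing a META.yml, running it is just:
$ perl yml-to-json.pl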

3 December 2010

Debian News: New Debian Developers (November 2010)

The following developers got their Debian accounts in the last month: Congratulations!

31 October 2010

David Bremner: Extracting text from pdf with pdfedit

It turns out that pdfedit is pretty good at extracting text from pdf files. Here is a script I wrote to do that in batch mode.
#!/bin/sh
# Print the text from a pdf document on stdout
# Copyright: (c) 2006-2010 PDFedit team  <http://sourceforge.net/projects/pdfedit>
# Copyright: (c) 2010, David Bremner <david@tethera.net>
# Licensed under version 2 or later of the GNU GPL
set -e
if [ $# -lt 1 ]; then
    echo usage: $0 file [pageSep]
    exit 1
fi
/usr/bin/pdfedit -console -eval '
function onConsoleStart() {
    var inName = takeParameter();
    var pageSep = takeParameter();
    var doc = loadPdf(inName,false);

    pages=doc.getPageCount();
    for (i=1;i<=pages;i++) {
        pg=doc.getPage(i);
        text=pg.getText();
        print(text);
        print("\n");
        print(pageSep);
    }
}
' $1 $2
Yeah, I wish #!/usr/bin/pdfedit worked too. Thanks to Aaron M Ucko for pointing out that -eval could replace the use of a temporary file. Oh, and pdfedit will be even better when the authors release a new version that fixes the truncation of wide text.

7 October 2010

David Bremner: Tags are notmuch the point

Dear Julien; After using notmuch for a while, I came to the conclusion that tags are mostly irrelevant. What is a game changer for me is fast global search. And yes, I changed from using dovecot search, so I mean much faster than that. Actually, I remember from the Human Computer Interface course I took in the early Neolithic era that speed of response has been measured as a key factor in interfaces, so maybe it isn't just me. Of course there are tradeoffs, some of which you mention. David

19 August 2010

Jonathan Wiltshire: Batch importing caff signatures

Having swapped details with many, many people at Debconf, and then been away for a week after that, I found myself with an overflowing mailbox and a long task of open mail, provide pass-phrase, pipe to gpg --import. I wanted a way to batch-import all these signatures (there are three times as many, because my key has three UIDs) in one or two goes, and tidy up the stragglers later. David Bremner wrote a small Perl script to do this from an mbox file, but I wanted to work in pure shell and with mutt. Just shoving the mbox at gpg resulted in it decrypting one message, then bailing at the fact the IDEA plugin is not present. Here was my eventual workflow, which only requires you to provide the pass-phrase once:
  1. create a maildir, either with maildir-make or a directory with cur, new and tmp directories nested inside;
  2. mark all relevant messages as read, and save them to here (it doesn't matter if others get caught up in it);
  3. now change to the maildir's cur directory, and run the following bash (disclaimer: totally untested and used at your own risk):

    for a in `ls`; do mv $a $a.gpg; done
    gpg --decrypt-files *.gpg
    rm *.gpg
    gpg --import *
    rm *
I expect there are better/quicker/safer ways to do it, but this worked well for me at midnight on a Monday evening. 19/08/10: Yes, it turns out I am a numpty, and Mutt can handle this all by itself with Ctrl-K and a tagged list. This is still quite handy when the private key is not on the machine you're using to read mail, though. Thanks for the corrections.

12 August 2010

David Bremner: Batch processing mails from caff

What is it? I was a bit daunted by the number of mails from people signing my gpg keys at debconf, so I wrote a script to mass process them. The workflow, for those of you using notmuch is as follows:
$ notmuch show --format=mbox tag:keysign > sigs.mbox
$ ffac sigs.mbox
where previously I have tagged keysigning emails as "keysign" if I want to import them. You also need to run gpg-agent, since I was too lazy/scared to deal with passphrases. This will import them into a keyring in ~/.ffac; uploading is still manual using something like
$ gpg --homedir=$HOME/.ffac --send-keys $keyid 
UPDATE: Before you upload all of those shiny signatures, you might want to use the included script fetch-sig-keys to add the corresponding keys to the temporary keyring in ~/.ffac. After
$ fetch-sig-keys $keyid
then
$ gpg --homedir ~/.ffac --list-sigs $keyid  
should have a UID associated with each signature. How do I use it? At the moment this has been tested once or twice by one person. More testing would be great, but be warned this is pre-release software until you can install it with apt-get. I have a patched version of the debian package that I could make available if there was interest.

4 August 2010

Michael Banck: 4 Aug 2010

Science and Math Track at DebConf10

This year's DebConf10 (which is great, by the way) at Columbia University, New York will feature Tracks for the first time. We had a Community Outreach track on Debian Day (to be continued by more awesome talks over the rest of the week), a Java track on Monday and an Enterprise track yesterday. Tomorrow, Thursday afternoon, the Science and Math track (which I am organizing) will take place in the Interschool lab on level 7 of Schapiro Center.

The track will start at 14:00 with a short welcome from me, followed by presentations of debian-science by Sylvestre Ledru and debian-math by David Bremner. At 15:00, Michael Hanke and Yaroslav Halchenko will present their talk on "Debian as the ultimate platform for neuroimaging research". This will be followed at 16:00 by three mini-talks on "New developments in Science Packaging": Adam C. Powell, IV will talk about MPI, Sylvestre Ledru will present linear algebra implementations in Debian and finally Michael Hanke and Yaroslav Halchenko will discuss the citation/reference infrastructure.

At the end of the track, the annual debian-science round-table will happen at 17:00, where David Bremner (mathematics), Michael Hanke (neuro-debian), Sylvestre Ledru (debian-science/pkg-scicomp), Adam C. Powell, IV (debian-science/pkg-scicomp) and myself (debichem) will discuss matters about cross-field debian-science and math related topics. If afterwards there are still outstanding matters to be discussed, we can schedule ad-hoc sessions for science or math matters on Friday or Saturday. See you at the science track tomorrow!

24 June 2010

David Bremner: Yet another tale of converting Debian packaging to Git

racket (previously known as plt-scheme) is an interpreter/JIT-compiler/development environment with about 6 years of subversion history in a converted git repo. Debian packaging has been done in subversion, with only the contents of ./debian in version control. I wanted to merge these into a single git repository. The first step is to create a repo and fetch the relevant history.
TMPDIR=/var/tmp
export TMPDIR
ME=`readlink -f $0`
AUTHORS=`dirname $ME`/authors
mkdir racket && cd racket && git init
git remote add racket git://git.racket-lang.org/plt
git fetch --tags racket
git config  merge.renameLimit 10000
git svn init  --stdlayout svn://svn.debian.org/svn/pkg-plt-scheme/plt-scheme/
git svn fetch -A$AUTHORS
git branch debian
A couple of points to note. Now, a couple of complications arose with upstream's git repo:
  1. Upstream releases separate source tarballs for unix, mac, and windows. Each of these is constructed by deleting a large number of files from version control, and occasionally some last minute fiddling with README files and so on.
  2. The history of the release tags is not completely linear. For example,
rocinante:~/projects/racket  (git-svn)-[master]-% git diff --shortstat v4.2.4 `git merge-base v4.2.4 v5.0`
 48 files changed, 242 insertions(+), 393 deletions(-)
rocinante:~/projects/racket  (git-svn)-[master]-% git diff --shortstat v4.2.1 `git merge-base v4.2.1 v4.2.4`
 76 files changed, 642 insertions(+), 1485 deletions(-)
The combination made my straightforward attempt at constructing a history synced with the release tarballs generate many conflicts. I ended up importing each tarball on a temporary branch, and the merges went smoother. Note also the use of "git merge -s recursive -X theirs" to resolve conflicts in favour of the new upstream version. The repetitive bits of the merge are collected as shell functions.
import_tgz() {
    if [ -f $1 ]; then
        git clean -fxd;
        git ls-files -z | xargs -0 rm -f;
        tar --strip-components=1 -zxvf $1 ;
        git add -A;
        git commit -m'Importing '`basename $1`;
    else
        echo "missing tarball $1";
    fi;
}
do_merge() {
    version=$1
    git checkout -b v$version-tarball v$version
    import_tgz ../plt-scheme_$version.orig.tar.gz
    git checkout upstream
    git merge -s recursive -X theirs v$version-tarball
}
post_merge() {
    version=$1
    git tag -f upstream/$version
    pristine-tar commit ../plt-scheme_$version.orig.tar.gz
    git branch -d v$version-tarball
}
The entire merge script is here. A typical step looks like
do_merge 5.0
git rm collects/tests/stepper/automatic-tests.ss
git add `git status -s | egrep ^UA | cut -f2 -d' '`
git checkout v5.0-tarball doc/release-notes/teachpack/HISTORY.txt
git rm readme.txt
git add  collects/tests/web-server/info.rkt
git commit -m'Resolve conflicts from new upstream version 5.0'
post_merge 5.0
Finally, we have the comparatively easy task of merging the upstream and Debian branches. In one or two places git was confused by all of the copying and renaming of files and I had to manually fix things up with git rm.
cd racket || /bin/true
set -e
git checkout debian
git tag -f packaging/4.0.1-2 `git svn find-rev r98`
git tag -f packaging/4.2.1-1 `git svn find-rev r113`
git tag -f packaging/4.2.4-2 `git svn find-rev r126`
git branch -f  master upstream/4.0.1
git checkout master
git merge packaging/4.0.1-2
git tag -f debian/4.0.1-2
git merge upstream/4.2.1
git merge packaging/4.2.1-1
git tag -f debian/4.2.1-1
git merge upstream/4.2.4
git merge packaging/4.2.4-2
git rm collects/tests/stxclass/more-tests.ss && git commit -m'fix false rename detection'
git tag -f debian/4.2.4-2
git merge -s recursive -X theirs upstream/5.0
git rm collects/tests/web-server/info.rkt
git commit -m 'Merge upstream 5.0'

30 March 2010

David Bremner: Distributed Issue Tracking with Git

I'm thinking about distributed issue tracking systems that play nice with git. I don't care about other version control systems anymore :). I also prefer command line interfaces, because as commentators on the blog have mentioned, I'm a Luddite (in the imprecise, slang sense). So far I have found a few projects, and tried to guess how much of a going concern they are, grouped into git-specific, VCS-agnostic, and sort-of VCS-agnostic.

14 March 2010

David Bremner: Functional programming on the JVM

I'm collecting information (or at least links) about functional programming languages on the JVM. I'm going to intentionally leave "functional programming language" undefined here, so that people can have fun debating :). The links are grouped into functional languages, languages with functional features, and projects and rumours.

6 March 2010

David Bremner: Mirroring a gitolite collection

You have a gitolite install on host $MASTER, and you want a mirror on $SLAVE. Here is one way to do that. $CLIENT is your workstation, that need not be the same as $MASTER or $SLAVE.
  1. On $CLIENT, install gitolite on $SLAVE. It is ok to re-use your gitolite admin key here, but make sure you have both the public and private key in .ssh, or confusion ensues. Note that when gitolite asks you to double check the "host gitolite" ssh stanza, you probably want to change the hostname to $SLAVE, at least temporarily (if not, at least the checkout of the gitolite-admin repo will fail). You may want to copy .gitolite.rc from $MASTER when gitolite fires up an editor.
  2. On $CLIENT, copy the "gitolite" stanza of .ssh/config to a new stanza called e.g. gitolite-slave; then fix the hostname of the gitolite stanza so it points to $MASTER again.
  3. On $MASTER, as the gitolite user, make a passphraseless ssh key. Probably you should call it something like 'mirror'.
  4. Still on $MASTER. Add a stanza like the following to $gitolite_user/.ssh/config
     host gitolite-mirror
       hostname $SLAVE
       identityfile ~/.ssh/mirror
    
    run ssh gitolite-mirror at least once to test and set up any "known_hosts" file.
  5. On $CLIENT, change directory to a checkout of gitolite admin from $MASTER. Make sure it is up to date with respect to origin
    git pull
    
  6. Edit .git/config (or, in very recent git, use git remote set-url --push --add) so that remote origin looks like
    fetch = +refs/heads/*:refs/remotes/origin/*
    url = gitolite:gitolite-admin
    pushurl = gitolite:gitolite-admin
    pushurl = gitolite-slave:gitolite-admin
    
  7. Add a stanza
    repo @all
      RW+     = mirror
    
    to the bottom of your gitolite.conf. Add mirror.pub to keydir.
  8. Now overwrite the gitolite-admin repo on $SLAVE with git push -f. Note that empty repos will be created on $SLAVE for every repo on $MASTER.
  9. Add the following one-line post-update hook to any repos you want mirrored (see the gitolite documentation for how to automate this, and the sketch of a full hook file after this list). You should not modify the post-update hook of the gitolite-admin repo. git push --mirror gitolite-mirror:$GL_REPO.git
  10. Create repos as per normal in the gitolite-admin/conf/gitolite.conf. If you have set the auto post-update hook installation, then each repo will be mirrored. You should only push to $MASTER; any changes pushed to $SLAVE will be overwritten.
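For step 9, a minimal sketch of such a hook file (untested here; it assumes, as described above, that gitolite provides $GL_REPO and that the gitolite user's ssh config has the gitolite-mirror stanza):
#!/bin/sh
# hooks/post-update: mirror every update of this repository to the slave
git push --mirror gitolite-mirror:$GL_REPO.git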

13 December 2009

David Bremner: Reading MPS files with glpk

Recently I was asked how to read mps (old school linear programming input) files. I couldn't think of a completely off-the-shelf way to do it, so I wrote a simple C program using the glpk library. Of course, in general you would want to do something other than print it out again.
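That program isn't reproduced here, but a minimal sketch along the same lines (the file handling, the choice of the fixed "deck" MPS format, and echoing to stdout are my assumptions, not the original code) might look like:
/* read-mps.c: read an MPS file with glpk and print it back out (sketch only) */
#include <stdio.h>
#include <glpk.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s file.mps\n", argv[0]);
        return 1;
    }
    glp_prob *lp = glp_create_prob();
    /* GLP_MPS_DECK is the old fixed-column MPS format; GLP_MPS_FILE is free MPS */
    if (glp_read_mps(lp, GLP_MPS_DECK, NULL, argv[1]) != 0) {
        fprintf(stderr, "%s: could not read %s\n", argv[0], argv[1]);
        return 1;
    }
    /* in general you would do something with the problem here */
    glp_write_mps(lp, GLP_MPS_DECK, NULL, "/dev/stdout");
    glp_delete_prob(lp);
    return 0;
}
Compile with something like gcc -o read-mps read-mps.c -lglpk.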

29 November 2009

David Bremner: Remote invocation of sbuild

So this is in some sense a nadir for shell scripting. 2 lines that do something out of 111. Mostly cargo-culted from cowpoke by ron, but much less fancy. rsbuild foo.dsc should do the trick.
#!/bin/sh
# Start a remote sbuild process via ssh. Based on cowpoke from devscripts.
# Copyright (c) 2007-9 Ron  <ron@debian.org>
# Copyright (c) David Bremner 2009 <david@tethera.net>
#
# Distributed according to Version 2 or later of the GNU GPL.
BUILDD_HOST=sbuild-host
BUILDD_DIR=var/sbuild   #relative to home directory
BUILDD_USER=""
DEBBUILDOPTS="DEB_BUILD_OPTIONS=\"parallel=3\""
BUILDD_ARCH="$(dpkg-architecture -qDEB_BUILD_ARCH 2>/dev/null)"
BUILDD_DIST="default"
usage()
{
    cat 1>&2 <<EOF
rsbuild [options] package.dsc
  Uploads a Debian source package to a remote host and builds it using sbuild.
  The following options are supported:
   --arch="arch"         Specify the Debian architecture(s) to build for.
   --dist="dist"         Specify the Debian distribution(s) to build for.
   --buildd="host"       Specify the remote host to build on.
   --buildd-user="name"  Specify the remote user to build as.
  The current default configuration is:
   BUILDD_HOST = $BUILDD_HOST
   BUILDD_USER = $BUILDD_USER
   BUILDD_ARCH = $BUILDD_ARCH
   BUILDD_DIST = $BUILDD_DIST
  The expected remote paths are:
  BUILDD_DIR  = $BUILDD_DIR
  sbuild must be configured on the build host.  You must have ssh
  access to the build host as BUILDD_USER if that is set, else as the
  user executing cowpoke or a user specified in your ssh config for
  '$BUILDD_HOST'.  That user must be able to execute sbuild.
EOF
    exit $1
}
PROGNAME="$(basename $0)"
version ()
{
    echo \
"This is $PROGNAME, version 0.0.0
This code is copyright 2007-9 by Ron <ron@debian.org>, all rights reserved.
Copyright 2009 by David Bremner <david@tethera.net>, all rights reserved.

This program comes with ABSOLUTELY NO WARRANTY.
You are free to redistribute this code under the terms of the
GNU General Public License, version 2 or later"
    exit 0
}
for arg; do
    case "$arg" in
        --arch=*)
            BUILDD_ARCH="${arg#*=}"
            ;;
        --dist=*)
            BUILDD_DIST="${arg#*=}"
            ;;
        --buildd=*)
            BUILDD_HOST="${arg#*=}"
            ;;
        --buildd-user=*)
            BUILDD_USER="${arg#*=}"
            ;;
        --dpkg-opts=*)
            DEBBUILDOPTS="DEB_BUILD_OPTIONS=\"${arg#*=}\""
            ;;
        *.dsc)
            DSC="$arg"
            ;;
        --help)
            usage 0
            ;;
        --version)
            version
            ;;
        *)
            echo "ERROR: unrecognised option '$arg'"
            usage 1
            ;;
    esac
done
dcmd rsync --verbose --checksum $DSC $BUILDD_USER$BUILDD_HOST:$BUILDD_DIR
ssh -t  $BUILDD_HOST "cd $BUILDD_DIR && $DEBBUILDOPTS sbuild --arch=$BUILDD_ARCH --dist=$BUILDD_DIST $DSC"

18 October 2009

David Bremner: Counting symbols from a debian symbols file

I am currently making a shared library out of some existing C code, for eventual inclusion in Debian. Because the author wasn't thinking about things like ABIs and APIs, the code is not too careful about what symbols it exports, and I decided to clean up some of the more obviously private symbols it exported. I wrote the following simple script because I got tired of running grep by hand. If you run it with
 grep-symbols symbolfile  *.c
It will print the symbols sorted by how many times they occur in the other arguments.
#!/usr/bin/perl
use strict;
use File::Slurp;
my $symfile=shift(@ARGV);
open SYMBOLS, "<$symfile" || die "$!";
# "parse" the symbols file
my %count=();
# skip first line;
$_=<SYMBOLS>;
while(<SYMBOLS>) {
  chomp();
  s/^\s*([^\@]+)\@.*$/$1/;
  $count{$_}=0;
}
# check the rest of the command line arguments for matches against symbols. Omega(n^2), sigh.
foreach my $file (@ARGV) {
  my $string=read_file($file);
  foreach my $sym (keys %count) {
    if ($string =~ m/\b$sym\b/) {
      $count{$sym}++;
    }
  }
}
print "Symbol\t Count\n";
foreach my $sym (sort {$count{$a} <=> $count{$b}} (keys %count)) {
  print "$sym\t$count{$sym}\n";
}

3 February 2009

David Bremner: source-highlight and oz

In order to have pretty highlighted oz code in HTML and TeX, I defined a simple language definition "oz.lang"
keyword = "andthen at attr case catch choice class cond",
          "declare define dis div do else elsecase ",
          "elseif elseof end fail false feat finally for",
          "from fun functor if import in local lock meth",
          "mod not of or orelse prepare proc prop raise",
          "require self skip then thread true try unit"
meta delim "<" ">"
cbracket = "{|}"
comment start "%"
symbol = "~","*","(",")","-","+","=","[","]","#",":",
       ",",".","/","?","&","<",">","\|"
atom delim "'" "'"  escape "\\"
atom = '[a-z][[:alpha:][:digit:]]*'
variable delim "`" "`"  escape "\\"
variable = '[A-Z][[:alpha:][:digit:]]*'
string delim "\"" "\"" escape "\\"
The meta tags are so I can intersperse EBNF notation in with oz code. Unfortunately source-highlight seems a little braindead about e.g. environment variables, so I had to wrap the invocation in a script
#!/bin/sh
HLDIR=$HOME/config/source-highlight
source-highlight --style-file=$HLDIR/default.style --lang-map=$HLDIR/lang.map $*
The final pieces of the puzzle are a customized lang.map file that tells source-highlight to use "oz.lang" for "foo.oz", and a default.style file that defines highlighting for "meta" text.
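Neither file is reproduced here; presumably the lang.map entry is just a one-line mapping along the lines of (a guess at the exact form):
oz = oz.lang
and default.style gains an extra line assigning a colour to the "meta" element.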

26 December 2008

David Bremner: Prolegomenon to any future tg-buildpackage

So I have been getting used to madduck's workflow for topgit and debian packaging, and one thing that bugged me a bit was all the steps required to build. I tend to build quite a lot when debugging, so I wrote up a quick and dirty script. I don't claim this is anywhere near production quality, but maybe it helps someone. Assumptions (that I remember): Here is the actual script:
#!/bin/sh
set -x
if [ x$1 = x-k ]; then
    keep=1
else
    keep=0
fi
WORKROOT=/tmp
WORKDIR=`mktemp -d $WORKROOT/tg-debuild-XXXX`
# yes, this could be nicer
SOURCEPKG=`dpkg-parsechangelog | grep ^Source: | sed 's/^Source:\s*//'`
UPSTREAM=`dpkg-parsechangelog | grep ^Version: | sed -e 's/^Version:\s*//' -e s/-[^-]*//`
ORIG=$WORKDIR/${SOURCEPKG}_${UPSTREAM}.orig.tar.gz
pristine-tar checkout $ORIG
WORKTREE=$WORKDIR/$SOURCEPKG-$UPSTREAM
CDUP=`git rev-parse --show-cdup`
GDPATH=$PWD/$CDUP/.git
DEST=$PWD/$CDUP/../build-area
git archive --prefix=$WORKTREE/ --format=tar master | tar xfP -
GIT_DIR=$GDPATH make -C $WORKTREE -f debian/rules tg-export
cd $WORKTREE && GIT_DIR=$GDPATH debuild 
if [ $? -eq 0 -a -d $DEST ]; then
    cp $WORKDIR/*.deb $WORKDIR/*.dsc $WORKDIR/*.diff.gz $WORKDIR/*.changes $DEST
fi
if [ $keep = 0 ]; then
    rm -fr $WORKDIR
fi

24 December 2008

David Bremner: So your topgit patch was merged upstream

Scenario You are maintaining a debian package with topgit. You have a topgit patch against version k and it has been merged into upstream version m. You want to "disable" the topgit branch, so that patches are not auto-generated, but you are not brave enough to just
   tg delete feature/foo
You are brave enough to follow the instructions of a random blog post. Checking your patch has really been merged upstream: this assumes that you have tags upstream/j for version j.
git checkout feature/foo
git diff upstream/k
For each file foo.c modified in the output above, have a look at
git diff upstream/m foo.c
This kind of has to be a manual process, because upstream could easily have modified your patch (e.g. formatting). The semi-destructive way: suppose you really never want to see that topgit branch again.
git update-ref -d refs/topbases/feature/foo
git checkout master
git branch -M feature/foo merged/foo
The non-destructive way. After I worked out the above, I realized that all I had to do was make an explicit list of topgit branches that I wanted exported. One minor trick is that the setting seems to have to go before the include, like this
TG_BRANCHES=debian/bin-makefile debian/libtoolize-lib debian/test-makefile
-include /usr/share/topgit/tg2quilt.mk
Conclusions I'm not really sure which approach is best yet. I'm going to start with the non-destructive one and see how that goes. Updated Madduck points to a third, more sophisticated approach in Debian BTS.

22 December 2008

David Bremner: A topgit testimonial

I wanted to report a success story with topgit, which is a rather new patch queue management extension for git. If that sounds like gibberish to you, this is probably not the blog entry you are looking for. Some time ago I decided to migrate the debian packaging of bibutils to topgit. This is not a very complicated package, with 7 quilt patches applied to upstream source. Since I don't have any experience to go on, I decided to follow Martin 'madduck' Krafft's suggestion for workflow. It all looks a bit complicated (madduck will be the first to agree), but it forced me to think about which patches were intended to go upstream and which were not. At the end of the conversion I had 4 patches that were cleanly based on upstream, and (perhaps most importantly for lazy people like me), I could send them upstream with tg mail. I did that, and a few days later, Chris Putnam sent me a new upstream release incorporating all of those patches. Of course, now I have to package this new upstream release :-). The astute reader might complain that this is more about me developing half-decent workflow, and Chris being a great guy, than about any specific tool. That may be true, but one thing I have discovered since I started using git is that tools that encourage good workflow are very nice. Actually, before I started using git, I didn't even use the word workflow. So I just wanted to give a public thank you to pasky for writing topgit and to madduck for pushing it into debian, and thinking about debian packaging with topgit.

3 December 2008

David Bremner: Using GLPK from C++

Recently I suggested to some students that they could use the Gnu Linear Programming Toolkit from C++. Shortly afterwards I thought I had better verify that I had not just sent people on a hopeless mission. To test things out, I decided to try using GLPK as part of an ongoing project with Lars Schewe. The basic idea of this example is to use glpk to solve an integer program with row generation. The main hurdle (assuming you want to actually write object oriented c++) is how to make the glpk callback work in an object oriented way. Luckily glpk provides a pointer "info" that can be passed to the solver, and which is passed back to the callback routine. This can be used to keep track of what object is involved. Here is the class header
#ifndef GLPSOL_HH
#define GLPSOL_HH
#include "LP.hh"
#include "Vektor.hh"
#include "glpk.h"
#include "combinat.hh"
namespace mpc {
  class GLPSol : public LP {
  private:
    glp_iocp parm;
    static Vektor<double> get_primal_sol(glp_prob *prob);
    static void callback(glp_tree *tree, void *info);
    static int output_handler(void *info, const char *s);
  protected:
    glp_prob *root;
  public:
    GLPSol(int columns);
    ~GLPSol() {};
    virtual void rowgen(const Vektor<double> &candidate) {};
    bool solve();
    bool add(const LinearConstraint &cnst);
  };
}
#endif
The class LP is just an abstract base class (like an interface for java-heads) defining the add method. The method rowgen is virtual because it is intended to be overridden by a subclass if row generation is actually required. By default it does nothing. Notice that the callback method here is static; that means it is essentially a C function with a funny name. This will be the function that glpk calls when it wants help from us.
#include <assert.h>
#include "GLPSol.hh"
#include "debug.hh"
namespace mpc {
  GLPSol::GLPSol(int columns) {
    // redirect logging to my handler
    glp_term_hook(output_handler,NULL);
    // make an LP problem
    root=glp_create_prob();
    glp_add_cols(root,columns);
    // all of my variables are binary, my objective function is always the same
    //  your milage may vary
    for (int j=1; j<=columns; j++) {
      glp_set_obj_coef(root,j,1.0);
      glp_set_col_kind(root,j,GLP_BV);
    }
    glp_init_iocp(&parm);
    // here is the interesting bit; we pass the address of the current object
    // into glpk along with the callback function
    parm.cb_func=GLPSol::callback;
    parm.cb_info=this;
  }
  int GLPSol::output_handler(void *info, const char *s) {
    DEBUG(1) << s;
    return 1;
  }
  Vektor<double> GLPSol::get_primal_sol(glp_prob *prob) {
    Vektor<double> sol;
    assert(prob);
    for (int i=1; i<=glp_get_num_cols(prob); i++) {
      sol[i]=glp_get_col_prim(prob,i);
    }
    return sol;
  }
  // the callback function just figures out what object called glpk and forwards
  // the call. I happen to decode the solution into a more convenient form, but 
  // you can do what you like
  void GLPSol::callback(glp_tree *tree, void *info) {
    GLPSol *obj=(GLPSol *)info;
    assert(obj);
    switch(glp_ios_reason(tree)) {
    case GLP_IROWGEN:
      obj->rowgen(get_primal_sol(glp_ios_get_prob(tree)));
      break;
    default:
      break;
    }
  }
  bool GLPSol::solve(void) {
    int ret=glp_simplex(root,NULL);
    if (ret==0)
      ret=glp_intopt(root,&parm);
    if (ret==0)
      return (glp_mip_status(root)==GLP_OPT);
    else
      return false;
  }
  bool GLPSol::add(const LinearConstraint&cnst) {
    int next_row=glp_add_rows(root,1);
    // for mysterious reasons, glpk wants to index from 1
    int indices[cnst.size()+1];
    double coeff[cnst.size()+1];
    DEBUG(3) << "adding " << cnst << std::endl;
    int j=1;
    for (LinearConstraint::const_iterator p=cnst.begin();
         p!=cnst.end(); p++) {
      indices[j]=p->first;
      coeff[j]=(double)p->second;
      j++;
    }
    int gtype=0;
    switch(cnst.type()) {
    case LIN_LEQ:
      gtype=GLP_UP;
      break;
    case LIN_GEQ:
      gtype=GLP_LO;
      break;
    default:
      gtype=GLP_FX;
    }
    glp_set_row_bnds(root,next_row,gtype,
                       (double)cnst.rhs(),(double)cnst.rhs());
    glp_set_mat_row(root,
                    next_row,
                    cnst.size(),
                    indices,
                    coeff);
    return true;
  }
}
All this is a big waste of effort unless we actually do some row generation. I'm not especially proud of the crude rounding I do here, but it shows how to do it, and it does, eventually, solve problems.
#include "OMGLPSol.hh"
#include "DualGraph.hh"
#include "CutIterator.hh"
#include "IntSet.hh"
namespace mpc {
  void OMGLPSol::rowgen(const Vektor<double>&candidate) {
    if (diameter<=0) {
      DEBUG(1) << "no path constraints to generate" << std::endl;
      return;
    }
    DEBUG(3) << "Generating paths for " << candidate << std::endl;
  // this looks like a crude hack, which it is, but motivated by the
  // following: the boundary complex is determined only by the signs
  // of the bases, which we here represent as 0 for - and 1 for +
    Chirotope chi(*this);
    for (Vektor<double>::const_iterator p=candidate.begin();
         p!=candidate.end(); p++) {
      if (p->second > 0.5) {
        chi[p->first]=SIGN_POS;
      } else {
        chi[p->first]=SIGN_NEG;
      }
    }
    BoundaryComplex bc(chi);
    DEBUG(3) << chi;
    DualGraph dg(bc);
    CutIterator pathins(*this,candidate);
    int paths_found=
      dg.all_paths(pathins,
                   IntSet::lex_set(elements(),rank()-1,source_facet),
                   IntSet::lex_set(elements(),rank()-1,sink_facet),
                   diameter-1);
    DEBUG(1) << "row generation found " << paths_found << " realized paths\n";
    DEBUG(1) << "effective cuts: " << pathins.effective() << std::endl;
  }
  void OMGLPSol::get_solution(Chirotope &chi) {
    int nv=glp_get_num_cols(root);
    for(int i=1;i<=nv;++i) {
      int val=glp_mip_col_val(root,i);
      chi[i]=(val==0 ? SIGN_NEG : SIGN_POS);
    }
  }
}
So ignore the problem-specific way I generate constraints; the key remaining piece of code is CutIterator, which filters the generated constraints to make sure they actually cut off the candidate solution. This is crucial: row generation must not add constraints when it cannot improve the solution, because glpk assumes that if the user is generating cuts, the solver doesn't have to.
#ifndef PATH_CONSTRAINT_ITERATOR_HH
#define PATH_CONSTRAINT_ITERATOR_HH
#include "PathConstraint.hh"
#include "CNF.hh"
namespace mpc {
  class CutIterator : public std::iterator<std::output_iterator_tag,
                                                      void,
                                                      void,
                                                      void,
                                                      void> {
  private:
    LP& _list;
    Vektor<double> _sol;
    std::size_t _pcount;
    std::size_t _ccount;
  public:
    CutIterator (LP& list, const Vektor<double>& sol) : _list(list),_sol(sol), _pcount(0), _ccount(0) {}
    CutIterator& operator=(const Path& p) {
      PathConstraint pc(p);
      _ccount+=pc.appendTo(_list,&_sol);
      _pcount++;
      if (_pcount %10000==0) {
        DEBUG(1) << _pcount << " paths generated" << std::endl;
      }
      return *this;
    }
    CutIterator& operator*() { return *this; }
    CutIterator& operator++() { return *this; }
    CutIterator& operator++(int) { return *this; }
    int effective() { return _ccount; };
  };
}
#endif
Oh heck, another level of detail; the actual filtering happens in the appendTo method of the PathConstraint class. This is just computing the dot product of two vectors. I would leave it as an exercise to the reader, but remember that some fuzz is necessary to do these kinds of comparisons with floating point numbers. Eventually, the decision is made by the following feasible method of the LinearConstraint class.
  bool feasible(const Vektor<double> & x) {
    double sum=0;
    for (const_iterator p=begin();p!=end(); p++) {
      sum+= p->second*x.at(p->first);
    }
    switch (type()) {
    case LIN_LEQ:
      return (sum <= _rhs+epsilon);
    case LIN_GEQ:
      return (sum >= _rhs-epsilon);
    default:
      return (sum <= _rhs+epsilon) &&
        (sum >= _rhs-epsilon);
    }
  }

10 November 2008

David Bremner: Using Org Mode as a time tracker

I have been meaning to fix this up for a long time, but so far real work keeps getting in the way. The idea is that C-c t brings you to this week's time tracker buffer, and then you use (C-c C-x C-i/C-c C-x C-o) to start and stop timers. The only even slightly clever bit is stopping the timer and saving on quitting emacs, which I borrowed from someone on the net. Oh, and one of the things I meant to fix was the dependence on mhc. Sorry. Here is the snippet from my .emacs
(require 'org-timetracker)
(setq   ott-file
  (expand-file-name 
    (let ((now (mhc-date-now)))
      (format "~/.org/%04d/%02d.org" 
         (mhc-date-yy now) (mhc-date-cw now)))))
It needs emacs restarted once a week, in order to pick up the new file name. The main guts of the hack are here. The result might look like this (works better in emacs org-mode; C-c C-x C-d for a summary).
